Making C Programs

Make Facility
C Control Statements
Basic C Data Types
Arrays in C
Structures in C
Left and Right Values
Procedures & Functions

Make

The make facility is an important tool that you will learn to use in this course. It will be valuable in all of your computing science courses
make can automatically construct the executable form of a program from a description of the source files and the dependencies between those files
Once you have produced a Makefile for a program you don't need to remember how to compile it, or even which files have changed, make handles all of this automatically for you
The main input to make is a Makefile, this is basically a description of how to build an executable system
The Makefile is divided into two main parts:

macro definitions
dependency declarations

Macro Definitions

Macros are used extensively in C programming, get used to them
A macro is a text replacement facility
A macro has a name, which is a character string, and a body, which is also a character string, wherever the name occurs in the file it is replaced by the body of the macro
In make we use macros to describe the compiler we are using, the compiler options, and the list of files we need. It allows us to parameterize the Makefile
A macro definition has the following format:

    macro-name = macro-body

A macro definition can appear anywhere in a Makefile
In make, a macro is invoked by mentioning its name, preceded by a $, if the macro name is more than one character long the name must be placed in parentheses
In our example Makefile we have

    FILES = main.o phone.o

When we use $(FILES) in the rest of the Makefile it is replaced by the two file names, as we add more files to the program we only need to change the line where FILES is declared, the rest of the Makefile stays the same - this saves time and reduces mistakes
Similarly we have the lines

    CC = gcc
    CFLAGS = -g

The first line defines the C compiler that we use and the second line defines the compiler flags. Again we can quickly change the compiler options without scanning through the entire Makefile
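The same text-replacement idea appears in C itself through the preprocessor. The following is only a sketch, the names MAX_NAME, name_buffer and buffer_size are made up for this illustration:

```c
#include <stddef.h>

/* MAX_NAME is a hypothetical macro: every later occurrence of the
   name is replaced by the text 25 before compilation proper. */
#define MAX_NAME 25

char name_buffer[MAX_NAME];   /* compiled as: char name_buffer[25]; */

/* Returns the number of bytes reserved for the buffer. */
size_t buffer_size(void) {
    return sizeof(name_buffer);
}
```

As with FILES in the Makefile, changing the one line that defines MAX_NAME changes every place where the name is used.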

Dependency Declarations

The dependency declarations tell make which files depend on which other files
In the case of our example, the file phone (the executable for our program) depends on both main.o and phone.o, we say this in the following way:

    phone: main.o phone.o

The files on the left of the : depend on the files listed on the right. Here that means that the executable module 'phone' depends on the two object modules, main.o and phone.o
Similarly, both main.o and phone.o depend on phone.h, there are two ways that we could specify this

    main.o : phone.h

    phone.o : phone.h

or

    main.o phone.o : phone.h

After each dependency declaration we can list the commands that construct the dependent files, these commands must be indented. WARNING: make requires each command line to begin with a tab character, not spaces, this applies to POSIX make, to GNU make, and to the version in your lab.
So in order to create phone, we use the following specification

    phone : main.o phone.o
            gcc -o phone main.o phone.o

When we need to re-create phone, make will automatically execute the gcc command
When is a dependency declaration used?
When make is started it looks at the time of last modification of all the files mentioned in the Makefile, if the last time of modification of a file on the right side of a dependency declaration is more recent than one of the files on the left side then the commands are executed
In our example, if either main.o or phone.o is more recent than phone, the gcc command will be executed, otherwise make knows that phone is up-to-date and does nothing
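The time comparison described above can be sketched in C. This is not how make is implemented, it is a simplified illustration: the helper name needs_rebuild is hypothetical, and the case where the target file does not exist yet (make rebuilds it) is left out.

```c
#include <time.h>

/* Hypothetical helper: returns 1 when any file on the right side was
   modified more recently than the target, 0 when the target is up to date. */
int needs_rebuild(time_t target_mtime, const time_t *dep_mtimes, int ndeps) {
    int i;
    for (i = 0; i < ndeps; i++) {
        if (dep_mtimes[i] > target_mtime)
            return 1;
    }
    return 0;
}
```

For phone, target_mtime would be the modification time of phone and dep_mtimes would hold the times of main.o and phone.o.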
How does make know how to make main.o and phone.o, there are no commands associated with their dependencies?
Make has a set of default rules that know about Unix's file naming conventions. We use suffixes to indicate the type of a file, a suffix starts with a dot ( . ) and has one or more letters, the common ones are:

.c a C program
.h a header file
.o an object file
.s an assembler file

Make uses the file suffixes to determine how to make a file, if make needs a .o file and can find a .c file with the same prefix, it knows that it can use the C compiler to produce the .o file
For example if name.o is required (it is mentioned on the right side of a dependency rule) but there is no rule for name.o, then make will look for a file named name.c, and if it finds one will run the C compiler on it. The use of default rules saves on the amount of typing you must do--BUT DON'T RELY ON THEM IN C201
You can construct your own default rules
If make is started without arguments, it will use the first dependency declaration as its target, that is it will make the file on the left side of the first dependency rule in the Makefile
In our example Makefile, phone is the first file mentioned, so by default make will execute that command first
You can specify the target when you call make, if we enter the command

     make main.o

then make will only run the C compiler on main.c to produce main.o, it will not produce a new version of phone
The most common file that you produce should be in the first dependency declaration, so you don't need to mention it each time you use make
Besides making the executable of a program there are other standard things that are placed in a Makefile, some of these operations include cleaning up temporary files, installing the executable in a standard place, running standard test cases, and printing the program
We could add the following rules to our Makefile:

    clean:
               rm -f $(FILES)

    install: phone
               cp phone $(HOME)/bin/phone

When we execute the command

    make clean

make runs the rm command, which removes the object files main.o and phone.o

C Control Statements

There are a number of control structures in C, we will briefly list them here:

Compound statement

    {    statements   }

Conditional Statement

    if ( expression )
        statement


    if ( expression )                       if ( expression )                   if ( expression ) {
        statement                               {   statements   }                  statements }
    else                                    else                                else {
        statement                               {   statements   }                  statements }

While statement

    while ( expression )                   while ( expression )
        statement                               {   statements   }

Do-while or Do-until statement

    do
        statement
    while ( expression ) ;

For statement

    for ( expression1 ; expression2 ; expression3 )
        statement

expression1 is evaluated once at the start of the loop
expression2 is evaluated at the start of each iteration, if the value of this expression is 0 the loop is exited
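expression3 is executed at the end of each iteration
The for statement behaves like the following while loop (the one difference is that a continue inside a for loop still executes expression3):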

    expression1;
    while ( expression2 ) {
        statement
        expression3;
    }
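The for statement and its while form can be compared side by side. A small sketch, the function names sum_for and sum_while are made up for this illustration:

```c
/* Sum of 0..n-1 written with a for statement. */
int sum_for(int n) {
    int i, total = 0;
    for (i = 0; i < n; i++)
        total += i;
    return total;
}

/* The same loop rewritten in while form. */
int sum_while(int n) {
    int i, total = 0;
    i = 0;                 /* expression1 */
    while (i < n) {        /* expression2 */
        total += i;        /* statement   */
        i++;               /* expression3 */
    }
    return total;
}
```

Both functions return the same value for any n, for example 10 when n is 5.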

Switch statement

    switch ( expression ) {
        case constant-expression : statement
        case constant-expression : statement
            .
            .
            .
        default : statement
    }

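The expression must have an integer type (char, int, and so on)
Each constant-expression must also be an integer constant, and no two case labels may have the same value
Execution does NOT stop at the next case label, it falls through into the following statements unless a break statement is used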
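A short sketch of a switch statement in use. The function classify and its return codes are hypothetical, chosen only for this illustration:

```c
/* Hypothetical example: 1 for a lower case vowel, 2 for a blank,
   0 for anything else. */
int classify(char c) {
    int kind;
    switch (c) {
        case 'a':
        case 'e':
        case 'i':
        case 'o':
        case 'u':          /* the labels above fall through to here */
            kind = 1;
            break;         /* leave the switch */
        case ' ':
            kind = 2;
            break;
        default:
            kind = 0;
    }
    return kind;
}
```

Without the break statements, execution would continue into the statements of the following cases.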

Break

    break ;

Continue

    continue ;

Example

In the example we use the following strategy to read from a file or terminal

    read first line
    while ( not end ) {
        process the line
        read next line
    }

The main problem with this approach is the need for two read statements, a better schema (plan) uses the break statement

    while ( true ) {
        read next line
        if ( end )
            break;
        process the line
    }
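The second schema could look like this in C. This is a sketch rather than the course's example program: the function name count_lines and the 256 character buffer are assumptions, and a line longer than the buffer is counted more than once.

```c
#include <stdio.h>

/* Counts the lines that can be read from the stream in, using the
   while(true)/break schema with a single read statement. */
int count_lines(FILE *in) {
    char line[256];
    int count = 0;
    while (1) {
        /* read next line, fgets returns NULL at the end of the input */
        if (fgets(line, sizeof line, in) == NULL)
            break;
        count++;           /* process the line */
    }
    return count;
}
```

Calling count_lines(stdin) reads from the terminal, a FILE pointer returned by fopen reads from a file.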


Basic C Types

C has a small number of basic data types, which are:

char a character
int an integer
float a single precision floating point number
double a double precision floating point number

I also consider pointers to be a basic data type.

A pointer is always the size of a machine address and is treated like an unsigned integer

There are three modifiers that can be applied to most of the basic data types
The long modifier specifies that the maximum number of bits is used to represent the value, for example on most machines a float is 32 bits and a double is 64 bits, in reality a double is a long float

The long modifier is often used with the int type. A long int uses at least as many bits as an int. On many machines an int and a long int are the same data type, but this is not always the case

The short modifier is used to specify that the minimum number of bits should be used, it is usually used with ints - a short int is 16 bits on most machines
The unsigned modifier can be used with character and integer data types, an unsigned value doesn't interpret the sign bit as a sign, it is used as part of the value, in other words unsigned values are never negative and use all the bits in the value - this is used in bit manipulation operations
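The sizes of the integer types differ between machines, so it is safer to ask the compiler than to assume. A small sketch, the function names are made up for this illustration:

```c
#include <limits.h>

/* Number of bits used by each integer type on this machine. */
int bits_short(void) { return (int)(sizeof(short) * CHAR_BIT); }
int bits_int(void)   { return (int)(sizeof(int)   * CHAR_BIT); }
int bits_long(void)  { return (int)(sizeof(long)  * CHAR_BIT); }

/* An unsigned value uses all of its bits for the magnitude, so
   adding 1 to the largest unsigned value wraps around to 0. */
unsigned wrap_around(void) {
    unsigned u = UINT_MAX;
    return u + 1u;
}
```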
A variable declaration has the following format:

    type variable_name = initial_value ;

The initial value part of the declaration is optional

Several variables can be declared at the same time, the variable names are separated by commas
A pointer is declared in the following way

    type* variable_name ;

The type specifies the type of the value that is being pointed to

Constants

Character constants are enclosed in single quotes, the \ is used as an escape character - for example '\045' is the character with octal value 45 (i.e. a % in ASCII), while '\n' is the end of line character, '\012' in ASCII.
An integer constant is a number without a decimal point, a long integer constant is an integer constant with L as a suffix - for example 123L is a long integer constant
An integer that starts with the digit 0 is a non-decimal constant. If the 0 is followed by another digit it is an octal constant - for example 037 is an octal integer constant
If the 0 is followed by an x or X, the constant is a hexadecimal integer constant - for example 0x1f is a hex constant (its decimal value is 1*16^1 + 15*16^0 = 31)
Double or floating point constants are numbers that have decimal points or exponents
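The different notations can be checked against each other. In this sketch (the function names are hypothetical) every function returns the value 31:

```c
/* The same value written in five different notations. */
int  as_decimal(void) { return 31; }
int  as_octal(void)   { return 037; }     /* 3*8 + 7   */
int  as_hex(void)     { return 0x1f; }    /* 1*16 + 15 */
long as_long(void)    { return 31L; }     /* long integer constant */
int  as_char(void)    { return '\037'; }  /* character with octal code 37 */
```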

Arrays

Arrays and pointers are very closely related in C
C basically supports one dimensional arrays, the first subscript value is always 0
An array is declared in the following way:

    type array_variable [ size ] ;

The type is the type of the individual array elements and size is the length of the array
For example

    int a[1000];

Arrays are initialized in the same way as basic data types, except a list of values must be specified
For example

    int a[10] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 };

A text string is a one dimensional array of char elements
For example, a text string can be declared in the following way:

    char string[25];

A character string constant is enclosed in double quotes ("), for example "this is a text string" - there is a major difference between single and double quotes
In C and Unix a text string is terminated by a zero byte, this is written as '\0'. Thus an empty string requires one byte of storage. In general a string with n characters requires n+1 bytes of storage - be careful to allocate the extra byte
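The extra byte can be seen by comparing strlen, which counts characters, with sizeof, which counts bytes of storage. A minimal sketch, the names fred, fred_length and fred_storage are made up for this illustration:

```c
#include <string.h>

/* "fred" has 4 characters but occupies 5 bytes because of the
   terminating '\0'. */
char fred[] = "fred";

size_t fred_length(void)  { return strlen(fred); }   /* characters */
size_t fred_storage(void) { return sizeof(fred); }   /* bytes      */
```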

Arrays and Pointers

In most expressions an array name behaves as a pointer, it points to the first element of the array
For example:

    int a[100];
    int* pa;

    pa = a;

Both pa and a point to the first element of the array of 100 integers, both of the following reference the same array element:

    a[i]   and *(pa+i)

The expression pa+i, takes the value of the pointer pa and adds i elements to it, thus pa+i points to the i'th element of the array a, the * operator treats its operand as a pointer and retrieves the value at that address, thus *(pa+i) first computes the address of element i, and then retrieves its value
The & operator is used to compute the address of a variable, for example after

    pa = &a[5];

the expression

    *(pa+2)

refers to the same element as a[7]

There is an important distinction between array and pointer declarations, an array declaration allocates storage for the array while a pointer declaration only allocates enough storage to store the pointer itself, no storage for the value pointed to is allocated
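These points about arrays and pointers can be checked with a short sketch. The function names are hypothetical, and the size of a pointer depends on the machine:

```c
#include <stddef.h>

int a[100];

/* Returns 1 when a[i] and *(pa+i) are the same element. */
int same_element(int i) {
    int *pa = a;
    return &a[i] == pa + i;
}

/* With pa = &a[5], the expression *(pa+2) is a[7]. */
int offset_from_fifth(void) {
    int *pa = &a[5];
    a[7] = 42;
    return *(pa + 2);
}

/* The array declaration reserves storage for 100 ints, a pointer
   declaration reserves storage for the pointer only. */
size_t array_bytes(void)   { return sizeof(a); }
size_t pointer_bytes(void) { int *pa = a; return sizeof(pa); }
```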
We can of course have an array of pointers, it is declared in the following way

    int* pa[10];

This produces an array containing 10 pointers to integers

Multi-Dimensional Arrays

A multi-dimensional array is just an array of arrays
Thus we can declare a 2 dimensional array in the following way:

    int a[10][20];

This is in fact 10 arrays, each of which contains 20 integer elements
An individual element of this array can be referenced in the following way

    a[i][j]

Of course we can still do things like the following

    int a[10][20];
    int* pa;

    pa = &a[0][0];

    x = *(pa+i*20+j);

Similarly we can do things like:

    int a[10][20];
    int (*pa)[20];

    pa = &a[0];

Then (*pa)[5] would be the same as a[0][5]
An observation, a variable is declared in the way that it is used
The type in a variable declaration is usually a basic type, the declaration syntax shows how a value of that basic type can be obtained from the variable name
Also note that [] has higher precedence than *, so we need to use parentheses in the above declaration, otherwise we would have an array of 20 pointers to integers instead of a pointer to an array of 20 integers
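Both ways of reaching an element of a 2 dimensional array can be sketched as follows. The names grid, get_flat and get_row are made up for this illustration:

```c
int grid[10][20];

/* Reads grid[i][j] through a plain int pointer: row i starts
   i*20 elements after the first element of the array. */
int get_flat(int i, int j) {
    int *p = &grid[0][0];
    return *(p + i * 20 + j);
}

/* Reads grid[i][j] through a pointer to a row of 20 ints. */
int get_row(int i, int j) {
    int (*row)[20] = &grid[0];
    return row[i][j];      /* same as (*(row + i))[j] */
}
```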

Character Arrays and Initialization

An array of pointers to text strings (which is not the same as a 2 dimensional array of characters) is often used in C programming, such a structure can be initialized in the following way:

    char* name[] = {
        "fred",
        "george",
        "paul",
        "mark",
        NULL
    };

The last line of the initialization isn't necessary, it serves as a marker for the end of the array, we can detect the end of the array in the following way:

    for(i=0; i<1000; i++) {
        if(name[i] == NULL)
            break;
    }
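The NULL marker lets a procedure work out the length of the array without being told it. A minimal sketch, the function name count_names is hypothetical:

```c
#include <stddef.h>

char *name[] = { "fred", "george", "paul", "mark", NULL };

/* Counts the entries that come before the NULL marker. */
int count_names(char *list[]) {
    int i = 0;
    while (list[i] != NULL)
        i++;
    return i;
}
```

For the array above count_names(name) gives 4, the NULL entry itself is not counted.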

Structures

C structures are similar to records in Pascal, they allow us to collect together several pieces of related data into one data structure, the individual pieces of data are called structure elements or structure members
A structure is declared in the following way:

    struct structure_name {
        member declaration;
        member declaration;
                .
                .
                .
        member declaration;
    };

This declaration doesn't allocate any memory, it just provides a template for the structure, it declares the values that can be stored together
The individual structure elements can be of any C type, including other structures
For example, we could have the following for the declaration of a name structure

    struct name {
        char* first;
        char* last;
    };

The name structure has two elements, the character pointers first and last, variables can have the same names as structure elements and the same element name can be used in different structure declarations
There are several ways that we can declare a variable that has a structure type
One way is to do the following:

    struct structure_name variable_name;

So with our example name struct we could do the following:

    struct name fred, george;

Another way of doing this is:

    struct structure_name {
        member declarations
    } variable_name;

In this approach we combine the structure declaration with the variable declaration, the structure name is optional, but it is always a good idea to include it
A structure variable can also be initialized, this can be done in the following way:

    struct structure_name variable_name = { element values } ;

So for our example we could have:

    struct name george = { "george", "brown" };

The element values are assigned in the same order as the element declarations
The . (dot) operator is used to extract the individual elements from a structure value, in the case of our name structure we can do the following:

    george.first
    george.last

In general the syntax is

    variable_name . element_name

A slightly different syntax is used for pointers to structures, for example

    struct name* person;

The variable person is now a pointer to a name structure, knowing what we know about pointers we can use the following to get the value of the first element of the structure that person points to

    (*person).first

or

    person->first

The -> operator takes a pointer to a struct, follows the pointer to the structure value and then extracts the field - this is a shorthand, but it makes sense
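The two notations can be compared directly. In this sketch the function names first_via_star and first_via_arrow are made up for the illustration:

```c
struct name {
    char *first;
    char *last;
};

/* Both functions return the first member of the structure that
   person points to. */
char *first_via_star(struct name *person)  { return (*person).first; }
char *first_via_arrow(struct name *person) { return person->first; }
```

The parentheses in (*person).first are required because . has higher precedence than *.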
We can have arrays of structures, which is often quite convenient, this is done in the following way:

    struct structure_name variable_name [ size ];

In the case of our name structure, we could have an initialized array of names constructed in the following way:

    struct name persons[] = {
        { "george", "brown" },
        { "fred", "black" },
                 .
                 .
                 .
        { NULL, NULL }
    };

Again we use explicit NULL pointers at the end of the array to indicate there are no more elements. There are other ways of doing this, but this is the safest
We can include pointers to a structure within the declaration of the structure, we use this technique to build linked lists and binary trees
We can use the following structure declaration for a node in a binary tree

    struct node {
        int value;
        struct node* left;
        struct node* right;
    };

Note that left and right must be pointers to structures, they cannot be structure variables, otherwise we will have a structure that includes two copies of itself
The same thing can be done for a linked list:

    struct list_node {
        float value;
        struct list_node* next;
    };
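A linked list built from this structure is processed by following the next pointers until a NULL pointer is reached. A minimal sketch, assuming the last node has next set to NULL, the function name list_sum is hypothetical:

```c
#include <stddef.h>

struct list_node {
    float value;
    struct list_node *next;
};

/* Adds up the values in the list by following the next pointers. */
float list_sum(const struct list_node *head) {
    float total = 0.0f;
    while (head != NULL) {
        total += head->value;
        head = head->next;
    }
    return total;
}
```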

Fields

The structure elements that we have seen so far have been standard C data types, these data types may not be the most efficient way of storing data in a structure
Fields allow us to pack data into a structure as densely as possible, a field is an integer value (either signed or unsigned) where the programmer specifies the number of bits occupied by the field value
We can define fields in the following way:

    struct example {
        int field_a : 4;
        int field_b : 6;
        unsigned field_c : 6;
    };

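In this structure all three fields can be packed into a single 16 bit word, the first two fields are signed and the last one is unsigned (whether a plain int field is signed is left to the compiler, most compilers treat it as signed)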
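Because a field has a fixed number of bits, values that do not fit are cut down. The sketch below uses the unsigned field, where the result is well defined, the function name store_c is made up for this illustration:

```c
struct example {
    int field_a : 4;
    int field_b : 6;
    unsigned field_c : 6;
};

/* Stores v in the 6 bit unsigned field and reads it back, only
   the low 6 bits of v survive. */
unsigned store_c(unsigned v) {
    struct example e;
    e.field_c = v;
    return e.field_c;
}
```

A 6 bit unsigned field holds the values 0 to 63, so storing 64 gives back 0.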

Lvalues and Rvalues

Lvalues and Rvalues are important in understanding expressions in C
An Lvalue is anything that can be on the left side of an assignment operator, in other words it represents a memory location where a value can be stored - Lvalues include variables and pointer expressions
An Rvalue is anything that can be on the right side of an assignment operator, in other words it represents an expression or value
All expressions are Rvalues, but only some of them can be used as Lvalues, in other words any place that an Rvalue is required an Lvalue can be used, but the opposite is not true

Assignment Expression

There are several forms of assignment expressions, note that assignment is an expression, it has a value and can be used as part of a larger expression
The general format of an assignment expression is

    Lvalue assignment_operator Rvalue

The standard assignment_operator is =, but there are several other useful ones, such as:

    +=
    -=
    *=
    /=

These operators are interpreted in the following way:

    x op= y

is the same as

    x = (x op y)
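A short sketch of the four operators applied one after another, the function name compound is made up for this illustration:

```c
/* Applies the four compound assignment operators in turn. */
int compound(int x) {
    x += 3;    /* x = x + 3 */
    x -= 1;    /* x = x - 1 */
    x *= 4;    /* x = x * 4 */
    x /= 2;    /* x = x / 2 */
    return x;
}
```

Starting from 5 the value becomes 8, 7, 28 and finally 14.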

Operators

C has the standard arithmetic operators:

    +
    -
    *
    /
    %   -  modulus or remainder

The standard comparison operators are:

    ==
    !=
    <
    <=
    >
    >=

Logical Operators

Recall, that in C zero is treated as false and non-zero is treated as true
The ! is the logical invert operator, if the operand is non-zero the result is zero and if the operand is zero the result is 1
There are two binary logical operators:

    &&  -  logical and
    ||  -  logical or

These operators don't always evaluate their second operand. In the case of &&, if the first operand evaluates to zero, the second operand is not evaluated since the result is already zero
Similarly for ||, if the first operand evaluates to a non-zero value the second operand won't be evaluated since the result will be 1
This allows us to write otherwise doubtful expressions like:

    if(n != 0 && m/n > 5)   ...
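The test can be wrapped in a procedure to show that the division is skipped when n is zero. A sketch, the function name ratio_exceeds_five is hypothetical:

```c
/* Returns 1 when m/n is greater than 5. When n is 0 the first
   operand of && is zero, so m / n is never evaluated. */
int ratio_exceeds_five(int m, int n) {
    return n != 0 && m / n > 5;
}
```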

Increment and Decrement Operators

The ++ and -- operators are used to increment and decrement Lvalues, they can be used as either a prefix or a postfix operator
If they are used as prefix operators the value of the expression is the new value of the Lvalue, for example

    m = ++n;

is the same as

    n = n+1;
    m = n;

If they are used as a postfix operator, the value of the expression is the value of the Lvalue before the operation is performed, for example

    m = n++;

is the same as

    m = n;
    n = n+1;
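The difference between the two forms shows up in the value that is assigned to m. A small sketch, the function names are made up for this illustration:

```c
/* Prefix: n is incremented first, m receives the new value. */
int prefix_result(int n) {
    int m = ++n;
    return m;
}

/* Postfix: m receives the old value, then n is incremented. */
int postfix_result(int n) {
    int m = n++;
    return m;
}
```

With n equal to 5, the prefix form gives m the value 6 and the postfix form gives m the value 5, in both cases n ends up as 6.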

Conditional Expression

A conditional expression has the following syntax:

    expression1 ? expression2 : expression3

First expression1 is evaluated, if it is non-zero then expression2 is evaluated and used as the value of the expression, in this case expression3 is ignored
If expression1 is zero then expression3 is evaluated and used as the value of the expression, in this case expression2 is ignored
This operator can be used in the following way

    x = n != 0 ? m/n : 0;

The conditional operator essentially allows the programmer to put an if statement in the middle of an expression
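The same computation can be written both ways. In this sketch the function names div_conditional and div_if are hypothetical:

```c
/* Division guarded with a conditional expression. */
int div_conditional(int m, int n) {
    return n != 0 ? m / n : 0;
}

/* The same logic written with an if statement. */
int div_if(int m, int n) {
    int x;
    if (n != 0)
        x = m / n;
    else
        x = 0;
    return x;
}
```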

Procedures - Part 1

There are two ways of declaring and defining procedures in C, the old way and the ANSI standard way - you will run into both so we will cover both
All procedures in C are really functions, that is they return a value. The special type void is used to indicate that no value is returned
C has both procedure declarations and procedure definitions - these are two different concepts
A procedure declaration contains all the information required to call the procedure, that is the name of the procedure, the types of its parameters (optional) and the type of the return value
A procedure definition includes all the information in a procedure declaration, plus local variables and the statements in the procedure - it not only describes how the procedure can be called, but also how it computes its value

Procedure Declarations

The old style of procedure declaration is:

    type procedure_name();

The type is the type of the return value
The ANSI style of procedure declaration is:

    type procedure_name(parameter_declarations);

The parameter declarations are separated by commas and each declaration has the following format:

    type parameter_name

This is the same format as variable declarations
The ANSI style procedure declarations should be used

Procedure Definitions

The old style of procedure definition is:

    type procedure_name(parameters)
        parameter declaration;
        parameter declaration;
            .
            .
            .
        parameter declaration;
    {
        variable declarations

        statements
    }

The parameters are a comma separated list of parameter names. The parameter declarations need not be in the same order as the parameter_names
The advantage of this format is that there is a separate line for each parameter, so they are easier to document
The ANSI style of procedure definition is:

    type procedure_name(parameter_declarations) {
        variable declarations

        statements
    }

In all cases the return statement is used to specify the value returned by the procedure and return control to the calling procedure
The two formats of the return statement are:

    return;

    return(Rvalue);

The first form is only used with procedures of type void
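The difference between a declaration and a definition can be seen in a small ANSI style sketch. The procedures larger and largest_of_three are made up for this illustration:

```c
/* Declaration: enough information to call the procedure. */
int larger(int a, int b);

/* A caller can use larger() before its definition appears. */
int largest_of_three(int a, int b, int c) {
    return larger(larger(a, b), c);
}

/* Definition: the declaration plus the statements that compute
   the value. */
int larger(int a, int b) {
    if (a > b)
        return a;
    return b;
}
```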

Parameter Passing

All parameter passing in C is by value, that is, when a procedure is called the parameter values from the calling procedure are copied into temporary storage in the called procedure - all modifications to the parameter values occur in this temporary storage, the original values in the calling procedure are not changed
This means that you cannot return a result directly through a parameter. The indirect approach is to use a parameter that points to the variable where the result should be stored
Remember that an array is passed as a pointer to its first element, so if you pass an array (not an array element) to a procedure, you can modify the elements in the array, and these modifications will be seen outside of the procedure

    void foo(int x) {
        /* we can do anything we like to x inside this procedure,
           the calling procedure won't see any of these changes,
           the calling procedure only provides the initial value of x */
    }

If the parameter is a pointer instead, the procedure can change the caller's variable:

    void foo(int* x) {
        *x = 5;
    }

    foo(&i);
    printf("%d",i);

This will print 5, since we have passed a pointer to i (that is, &i) into the procedure. The value pointed at is changed within the procedure - NOTE that foo(i) will cause all sorts of problems if i isn't a pointer
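The classic illustration of these rules is exchanging two values. The sketch below is not part of the course example, the names swap_by_value, swap and clear are made up for this illustration:

```c
/* Has no effect on the caller: x and y are copies. */
void swap_by_value(int x, int y) {
    int t = x;
    x = y;
    y = t;
}

/* Exchanges the caller's variables through pointers. */
void swap(int *x, int *y) {
    int t = *x;
    *x = *y;
    *y = t;
}

/* An array argument arrives as a pointer to its first element,
   so the caller sees the elements change. */
void clear(int v[], int n) {
    int i;
    for (i = 0; i < n; i++)
        v[i] = 0;
}
```

A call such as swap(&i, &j) exchanges i and j, while swap_by_value(i, j) leaves them untouched.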